TheDarkula
Contributor

r? @Zoxc

Built the regex crate, commit 60d087a23025e045ae754a345b04003c31d83d93.

Serial:

  time: 0.067; rss: 62MB	parsing
  time: 0.000; rss: 62MB	attributes injection
  time: 0.000; rss: 62MB	recursion limit
  time: 0.000; rss: 62MB	crate injection
  time: 0.000; rss: 62MB	plugin loading
  time: 0.000; rss: 62MB	plugin registration
  time: 0.003; rss: 62MB	pre ast expansion lint checks
    time: 0.168; rss: 96MB	expand crate
    time: 0.000; rss: 96MB	check unused macros
  time: 0.168; rss: 96MB	expansion
  time: 0.000; rss: 96MB	maybe building test harness
  time: 0.002; rss: 96MB	AST validation
  time: 0.000; rss: 96MB	maybe creating a macro crate
  time: 0.047; rss: 102MB	name resolution
  time: 0.015; rss: 102MB	complete gated feature checking
  time: 0.035; rss: 110MB	lowering ast -> hir
  time: 0.006; rss: 111MB	early lint checks
    time: 0.006; rss: 114MB	validate hir map
  time: 0.030; rss: 114MB	indexing hir
  time: 0.000; rss: 114MB	load query result cache
  time: 0.000; rss: 114MB	dep graph tcx init
  time: 0.000; rss: 114MB	looking for entry point
  time: 0.000; rss: 114MB	looking for plugin registrar
  time: 0.000; rss: 114MB	looking for derive registrar
  time: 0.002; rss: 114MB	loop checking
  time: 0.002; rss: 116MB	attribute checking
    time: 0.000; rss: 119MB	builtin::check_trait checking
  time: 0.036; rss: 137MB	stability checking
  time: 0.033; rss: 147MB	type collecting
  time: 0.002; rss: 147MB	impl wf inference
    time: 0.000; rss: 147MB	builtin::check_trait checking
    time: 0.000; rss: 147MB	builtin::check_trait checking
    time: 0.000; rss: 147MB	builtin::check_trait checking
    time: 0.000; rss: 147MB	builtin::check_trait checking
    time: 0.000; rss: 147MB	builtin::check_trait checking
    time: 0.001; rss: 147MB	builtin::check_trait checking
    time: 0.000; rss: 147MB	builtin::check_trait checking
    time: 0.000; rss: 147MB	builtin::check_trait checking
    time: 0.000; rss: 149MB	builtin::check_trait checking
    time: 0.000; rss: 149MB	builtin::check_trait checking
    time: 0.000; rss: 150MB	builtin::check_trait checking
    time: 0.000; rss: 150MB	builtin::check_trait checking
    time: 0.000; rss: 150MB	builtin::check_trait checking
    time: 0.000; rss: 150MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 154MB	builtin::check_trait checking
    time: 0.000; rss: 154MB	builtin::check_trait checking
    time: 0.000; rss: 154MB	unsafety checking
    time: 0.000; rss: 154MB	orphan checking
  time: 0.046; rss: 154MB	coherence checking
  time: 0.125; rss: 157MB	wf checking
  time: 0.051; rss: 164MB	item-types checking
  time: 0.816; rss: 198MB	item-bodies checking
    time: 0.095; rss: 202MB	rvalue promotion
    time: 0.003; rss: 202MB	intrinsic checking
    time: 0.017; rss: 204MB	match checking
    time: 0.013; rss: 204MB	liveness checking
  time: 0.128; rss: 204MB	misc checking
  time: 0.604; rss: 238MB	borrow checking
  time: 0.002; rss: 241MB	MIR borrow checking
  time: 0.000; rss: 241MB	dumping chalk-like clauses
  time: 0.000; rss: 241MB	MIR effect checking
  time: 0.000; rss: 241MB	layout testing
    time: 0.045; rss: 245MB	privacy checking
    time: 0.009; rss: 246MB	death checking
    time: 0.002; rss: 246MB	unused lib feature checking
    time: 0.029; rss: 246MB	lint checking
  time: 0.084; rss: 246MB	misc checking
  time: 0.000; rss: 246MB	resolving dependency formats
        time: 0.002; rss: 251MB	collecting roots
        time: 0.386; rss: 281MB	collecting mono items
      time: 0.388; rss: 281MB	monomorphization collection
      time: 0.022; rss: 288MB	codegen unit partitioning
    time: 0.524; rss: 293MB	write metadata
    time: 0.000; rss: 301MB	llvm function passes [regex.daejiqas-cgu.11]
    time: 0.000; rss: 315MB	llvm function passes [regex.daejiqas-cgu.7]
    time: 0.261; rss: 345MB	llvm module passes [regex.daejiqas-cgu.11]
    time: 0.001; rss: 345MB	llvm function passes [regex.daejiqas-cgu.0]
    time: 0.015; rss: 346MB	llvm module passes [regex.daejiqas-cgu.0]
    time: 0.241; rss: 347MB	llvm module passes [regex.daejiqas-cgu.7]
    time: 0.001; rss: 385MB	llvm function passes [regex.daejiqas-cgu.10]
    time: 0.397; rss: 396MB	codegen passes [regex.daejiqas-cgu.0]
    time: 0.588; rss: 418MB	codegen passes [regex.daejiqas-cgu.11]
    time: 0.000; rss: 407MB	llvm function passes [regex.daejiqas-cgu.15]
    time: 0.017; rss: 408MB	llvm module passes [regex.daejiqas-cgu.15]
    time: 0.001; rss: 419MB	llvm function passes [regex.daejiqas-cgu.3]
    time: 0.010; rss: 420MB	llvm module passes [regex.daejiqas-cgu.3]
    time: 0.405; rss: 423MB	llvm module passes [regex.daejiqas-cgu.10]
    time: 0.357; rss: 432MB	codegen passes [regex.daejiqas-cgu.15]
    time: 0.943; rss: 432MB	codegen passes [regex.daejiqas-cgu.7]
    time: 0.262; rss: 432MB	codegen passes [regex.daejiqas-cgu.3]
    time: 0.000; rss: 433MB	llvm function passes [regex.daejiqas-cgu.2]
    time: 0.012; rss: 414MB	llvm module passes [regex.daejiqas-cgu.2]
    time: 0.000; rss: 414MB	llvm function passes [regex.daejiqas-cgu.1]
    time: 0.005; rss: 414MB	llvm module passes [regex.daejiqas-cgu.1]
    time: 0.001; rss: 421MB	llvm function passes [regex.daejiqas-cgu.6]
    time: 0.003; rss: 421MB	llvm module passes [regex.daejiqas-cgu.6]
    time: 0.263; rss: 422MB	codegen passes [regex.daejiqas-cgu.2]
    time: 0.269; rss: 423MB	codegen passes [regex.daejiqas-cgu.1]
    time: 0.000; rss: 427MB	llvm function passes [regex.daejiqas-cgu.4]
    time: 0.011; rss: 427MB	llvm module passes [regex.daejiqas-cgu.4]
    time: 0.000; rss: 432MB	llvm function passes [regex.daejiqas-cgu.8]
    time: 0.003; rss: 432MB	llvm module passes [regex.daejiqas-cgu.8]
    time: 0.259; rss: 434MB	codegen passes [regex.daejiqas-cgu.6]
    time: 0.000; rss: 434MB	llvm function passes [regex.daejiqas-cgu.5]
    time: 0.002; rss: 434MB	llvm module passes [regex.daejiqas-cgu.5]
    time: 0.297; rss: 434MB	codegen passes [regex.daejiqas-cgu.4]
    time: 0.000; rss: 440MB	llvm function passes [regex.daejiqas-cgu.9]
    time: 0.003; rss: 440MB	llvm module passes [regex.daejiqas-cgu.9]
    time: 0.243; rss: 440MB	codegen passes [regex.daejiqas-cgu.8]
    time: 0.000; rss: 442MB	llvm function passes [regex.daejiqas-cgu.14]
    time: 0.002; rss: 442MB	llvm module passes [regex.daejiqas-cgu.14]
    time: 0.238; rss: 442MB	codegen passes [regex.daejiqas-cgu.5]
    time: 1.910; rss: 443MB	codegen to LLVM IR
    time: 0.000; rss: 443MB	assert dep graph
    time: 0.000; rss: 443MB	serialize dep graph
  time: 3.023; rss: 443MB	codegen
    time: 0.000; rss: 443MB	llvm function passes [regex.daejiqas-cgu.12]
    time: 0.005; rss: 443MB	llvm module passes [regex.daejiqas-cgu.12]
    time: 0.222; rss: 437MB	codegen passes [regex.daejiqas-cgu.9]
    time: 0.000; rss: 437MB	llvm function passes [regex.daejiqas-cgu.13]
    time: 0.002; rss: 437MB	llvm module passes [regex.daejiqas-cgu.13]
    time: 0.172; rss: 437MB	codegen passes [regex.daejiqas-cgu.12]
    time: 0.125; rss: 437MB	codegen passes [regex.daejiqas-cgu.13]
    time: 0.270; rss: 437MB	codegen passes [regex.daejiqas-cgu.14]
    time: 1.412; rss: 449MB	codegen passes [regex.daejiqas-cgu.10]
  time: 2.799; rss: 445MB	LLVM passes
  time: 0.000; rss: 446MB	serialize work products
  time: 0.028; rss: 446MB	linking
Self profiling results for regex:

| Phase                                     | Time (ms)      | Time (%) | Queries        | Hits (%)
| ----------------------------------------- | -------------- | -------- | -------------- | --------
| Codegen                                   |           3036 |    52.49 |         124616 |    89.76
| TypeChecking                              |           1383 |    23.91 |        1008274 |    91.97
| Other                                     |            980 |    16.94 |        1644015 |    94.66
| Expansion                                 |            168 |     2.90 |              0 |     0.00
| BorrowChecking                            |            109 |     1.88 |          13895 |    65.21
| Parsing                                   |             67 |     1.16 |              0 |     0.00
| Linking                                   |             41 |     0.71 |          17357 |    88.70

Optimization level: No
Incremental: off

Parallel:

  time: 0.055; rss: 64MB	parsing
  time: 0.000; rss: 64MB	attributes injection
  time: 0.000; rss: 64MB	recursion limit
  time: 0.000; rss: 64MB	crate injection
  time: 0.000; rss: 64MB	plugin loading
  time: 0.000; rss: 64MB	plugin registration
  time: 0.003; rss: 64MB	pre ast expansion lint checks
    time: 0.123; rss: 97MB	expand crate
    time: 0.000; rss: 97MB	check unused macros
  time: 0.123; rss: 97MB	expansion
  time: 0.000; rss: 97MB	maybe building test harness
  time: 0.002; rss: 97MB	AST validation
  time: 0.000; rss: 97MB	maybe creating a macro crate
  time: 0.053; rss: 103MB	name resolution
  time: 0.016; rss: 103MB	complete gated feature checking
  time: 0.040; rss: 112MB	lowering ast -> hir
  time: 0.008; rss: 112MB	early lint checks
    time: 0.003; rss: 115MB	validate hir map
  time: 0.027; rss: 115MB	indexing hir
  time: 0.000; rss: 115MB	load query result cache
  time: 0.000; rss: 116MB	dep graph tcx init
  time: 0.000; rss: 116MB	looking for entry point
  time: 0.000; rss: 116MB	looking for plugin registrar
  time: 0.000; rss: 116MB	looking for derive registrar
  time: 0.003; rss: 116MB	loop checking
  time: 0.010; rss: 118MB	attribute checking
    time: 0.000; rss: 121MB	builtin::check_trait checking
  time: 0.052; rss: 139MB	stability checking
  time: 0.039; rss: 151MB	type collecting
  time: 0.002; rss: 151MB	impl wf inference
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 151MB	builtin::check_trait checking
    time: 0.000; rss: 152MB	builtin::check_trait checking
    time: 0.001; rss: 152MB	builtin::check_trait checking
    time: 0.000; rss: 152MB	builtin::check_trait checking
    time: 0.000; rss: 152MB	builtin::check_trait checking
    time: 0.000; rss: 153MB	builtin::check_trait checking
    time: 0.000; rss: 154MB	builtin::check_trait checking
    time: 0.000; rss: 154MB	builtin::check_trait checking
    time: 0.000; rss: 154MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	builtin::check_trait checking
    time: 0.000; rss: 157MB	unsafety checking
    time: 0.000; rss: 157MB	orphan checking
  time: 0.047; rss: 157MB	coherence checking
  time: 0.129; rss: 159MB	wf checking
  time: 0.053; rss: 164MB	item-types checking
  time: 0.352; rss: 200MB	item-bodies checking
  time: 0.009; rss: 203MB	intrinsic checking
  time: 0.030; rss: 204MB	liveness checking
  time: 0.067; rss: 208MB	match checking
    time: 0.078; rss: 211MB	rvalue promotion
  time: 0.079; rss: 211MB	misc checking
  time: 0.305; rss: 246MB	borrow checking
  time: 0.002; rss: 246MB	MIR borrow checking
  time: 0.000; rss: 246MB	dumping chalk-like clauses
  time: 0.000; rss: 246MB	MIR effect checking
  time: 0.000; rss: 246MB	layout testing
    time: 0.046; rss: 251MB	privacy checking
  time: 0.049; rss: 251MB	unused lib feature checking
  time: 0.059; rss: 251MB	death checking
  time: 0.079; rss: 251MB	lint checking
  time: 0.079; rss: 251MB	misc checking
  time: 0.000; rss: 251MB	resolving dependency formats
        time: 0.003; rss: 253MB	collecting roots
        time: 0.226; rss: 288MB	collecting mono items
      time: 0.229; rss: 288MB	monomorphization collection
      time: 0.022; rss: 296MB	codegen unit partitioning
    time: 0.372; rss: 300MB	write metadata
    time: 0.000; rss: 313MB	llvm function passes [regex.daejiqas-cgu.11]
    time: 0.000; rss: 327MB	llvm function passes [regex.daejiqas-cgu.7]
    time: 0.266; rss: 354MB	llvm module passes [regex.daejiqas-cgu.11]
    time: 0.001; rss: 355MB	llvm function passes [regex.daejiqas-cgu.0]
    time: 0.009; rss: 355MB	llvm module passes [regex.daejiqas-cgu.0]
    time: 0.235; rss: 361MB	llvm module passes [regex.daejiqas-cgu.7]
    time: 0.001; rss: 409MB	llvm function passes [regex.daejiqas-cgu.10]
    time: 0.495; rss: 409MB	codegen passes [regex.daejiqas-cgu.0]
    time: 0.564; rss: 413MB	codegen passes [regex.daejiqas-cgu.11]
    time: 0.000; rss: 402MB	llvm function passes [regex.daejiqas-cgu.15]
    time: 0.017; rss: 405MB	llvm module passes [regex.daejiqas-cgu.15]
    time: 0.001; rss: 427MB	llvm function passes [regex.daejiqas-cgu.3]
    time: 0.011; rss: 429MB	llvm module passes [regex.daejiqas-cgu.3]
    time: 0.434; rss: 437MB	llvm module passes [regex.daejiqas-cgu.10]
    time: 0.348; rss: 438MB	codegen passes [regex.daejiqas-cgu.15]
    time: 0.974; rss: 446MB	codegen passes [regex.daejiqas-cgu.7]
    time: 0.247; rss: 448MB	codegen passes [regex.daejiqas-cgu.3]
    time: 0.000; rss: 441MB	llvm function passes [regex.daejiqas-cgu.2]
    time: 0.001; rss: 441MB	llvm function passes [regex.daejiqas-cgu.1]
    time: 0.006; rss: 436MB	llvm module passes [regex.daejiqas-cgu.1]
    time: 0.014; rss: 437MB	llvm module passes [regex.daejiqas-cgu.2]
    time: 0.001; rss: 440MB	llvm function passes [regex.daejiqas-cgu.6]
    time: 0.003; rss: 440MB	llvm module passes [regex.daejiqas-cgu.6]
    time: 0.244; rss: 446MB	codegen passes [regex.daejiqas-cgu.1]
    time: 0.304; rss: 447MB	codegen passes [regex.daejiqas-cgu.2]
    time: 0.000; rss: 448MB	llvm function passes [regex.daejiqas-cgu.4]
    time: 0.011; rss: 448MB	llvm module passes [regex.daejiqas-cgu.4]
    time: 0.266; rss: 454MB	codegen passes [regex.daejiqas-cgu.6]
    time: 0.000; rss: 454MB	llvm function passes [regex.daejiqas-cgu.8]
    time: 0.007; rss: 454MB	llvm module passes [regex.daejiqas-cgu.8]
    time: 0.000; rss: 457MB	llvm function passes [regex.daejiqas-cgu.9]
    time: 0.002; rss: 457MB	llvm module passes [regex.daejiqas-cgu.9]
    time: 0.336; rss: 461MB	codegen passes [regex.daejiqas-cgu.4]
    time: 0.208; rss: 461MB	codegen passes [regex.daejiqas-cgu.9]
    time: 0.254; rss: 461MB	codegen passes [regex.daejiqas-cgu.8]
    time: 0.000; rss: 461MB	llvm function passes [regex.daejiqas-cgu.5]
    time: 0.005; rss: 461MB	llvm module passes [regex.daejiqas-cgu.5]
    time: 0.000; rss: 461MB	llvm function passes [regex.daejiqas-cgu.12]
    time: 0.002; rss: 461MB	llvm module passes [regex.daejiqas-cgu.12]
    time: 2.132; rss: 464MB	codegen to LLVM IR
    time: 0.000; rss: 464MB	assert dep graph
    time: 0.000; rss: 464MB	serialize dep graph
  time: 2.990; rss: 464MB	codegen
    time: 0.000; rss: 464MB	llvm function passes [regex.daejiqas-cgu.13]
    time: 0.003; rss: 464MB	llvm module passes [regex.daejiqas-cgu.13]
    time: 0.211; rss: 460MB	codegen passes [regex.daejiqas-cgu.12]
    time: 0.000; rss: 460MB	llvm function passes [regex.daejiqas-cgu.14]
    time: 0.002; rss: 460MB	llvm module passes [regex.daejiqas-cgu.14]
    time: 0.260; rss: 462MB	codegen passes [regex.daejiqas-cgu.5]
    time: 0.153; rss: 463MB	codegen passes [regex.daejiqas-cgu.13]
    time: 0.272; rss: 470MB	codegen passes [regex.daejiqas-cgu.14]
    time: 1.617; rss: 482MB	codegen passes [regex.daejiqas-cgu.10]
  time: 3.135; rss: 479MB	LLVM passes
  time: 0.000; rss: 477MB	serialize work products
  time: 0.147; rss: 477MB	linking
Self profiling results for regex:

| Phase                                     | Time (ms)      | Time (%) | Queries        | Hits (%)
| ----------------------------------------- | -------------- | -------- | -------------- | --------
| Codegen                                   |           3690 |    46.60 |         124616 |    89.76
| TypeChecking                              |           2241 |    28.30 |        1008274 |    91.97
| Other                                     |           1448 |    18.29 |        1646307 |    94.67
| BorrowChecking                            |            200 |     2.53 |          13895 |    65.21
| Linking                                   |            163 |     2.06 |          17357 |    88.70
| Expansion                                 |            123 |     1.55 |              0 |     0.00
| Parsing                                   |             54 |     0.68 |              0 |     0.00

Optimization level: No
Incremental: off
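
For a quick side-by-side reading of the two logs above, the per-pass times can be turned into a serial-to-parallel ratio. The following is only a minimal sketch: the helper name `speedup` is hypothetical, the numbers are copied by hand from the `time:` lines of the serial and parallel logs, and each comes from a single run, so the ratios are indicative at best.

```rust
/// Ratio of serial to parallel wall-clock time for one pass; values above
/// 1.0 mean the parallel build was faster for that pass. Returns `None`
/// when the parallel time is zero, as it is for many short passes in the log.
fn speedup(serial_secs: f64, parallel_secs: f64) -> Option<f64> {
    if parallel_secs > 0.0 {
        Some(serial_secs / parallel_secs)
    } else {
        None
    }
}

fn main() {
    // (pass name, serial seconds, parallel seconds), copied from the logs above.
    let passes = [
        ("item-bodies checking", 0.816, 0.352),
        ("borrow checking", 0.604, 0.305),
        ("codegen", 3.023, 2.990),
        ("LLVM passes", 2.799, 3.135),
    ];
    for (name, serial, parallel) in passes {
        match speedup(serial, parallel) {
            Some(ratio) => println!("{name}: {ratio:.2}x"),
            None => println!("{name}: parallel time is zero"),
        }
    }
}
```

A ratio below 1.0, as for `LLVM passes` here, means that pass took longer in the parallel run.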

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 15, 2019
@Zoxc
Contributor

Zoxc commented Feb 21, 2019

You should benchmark before and after the changes, both with a parallel compiler build. Do multiple runs and look at the timing for the parallel section you are modifying, i.e.:

  time: 0.009; rss: 203MB	intrinsic checking
  time: 0.030; rss: 204MB	liveness checking
  time: 0.067; rss: 208MB	match checking
    time: 0.078; rss: 211MB	rvalue promotion
  time: 0.079; rss: 211MB	misc checking

It's a good idea to benchmark with fewer threads than you have cores, so that some cores are left for background processes.
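
The multi-run comparison suggested here could be scripted along these lines. This is a minimal sketch, not tooling from the PR: it assumes each run's output was saved to a text file in the `time: …; rss: …` format shown in this thread, and the names `parse_time_pass` and `mean_per_pass` are hypothetical.

```rust
use std::collections::BTreeMap;
use std::{env, fs};

/// Parse one log line such as "  time: 0.067; rss: 208MB\tmatch checking"
/// into (pass name, seconds). Lines in any other shape yield `None`.
fn parse_time_pass(line: &str) -> Option<(String, f64)> {
    let rest = line.trim_start().strip_prefix("time:")?;
    let (secs, rest) = rest.split_once(';')?;
    let secs: f64 = secs.trim().parse().ok()?;
    let rest = rest.trim_start().strip_prefix("rss:")?.trim_start();
    // Skip the memory figure; the pass name follows after whitespace.
    let (_rss, name) = rest.split_once(char::is_whitespace)?;
    Some((name.trim().to_string(), secs))
}

/// Mean seconds per pass across runs. Passes that appear several times in
/// one run (e.g. "misc checking") are summed before averaging.
fn mean_per_pass(runs: &[String]) -> BTreeMap<String, f64> {
    let mut totals: BTreeMap<String, f64> = BTreeMap::new();
    for run in runs {
        for (name, secs) in run.lines().filter_map(parse_time_pass) {
            *totals.entry(name).or_insert(0.0) += secs;
        }
    }
    let n = runs.len() as f64;
    totals.into_iter().map(|(name, total)| (name, total / n)).collect()
}

fn main() {
    // Each command-line argument is the path to the saved log of one run.
    let runs: Vec<String> = env::args()
        .skip(1)
        .map(|path| fs::read_to_string(&path).expect("could not read log file"))
        .collect();
    for (name, mean) in mean_per_pass(&runs) {
        println!("{mean:.3}s\t{name}");
    }
}
```

Running it once over the "before" logs and once over the "after" logs gives two per-pass averages to compare for the section being modified.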

@Centril Centril added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 23, 2019
@Centril
Contributor

Centril commented Feb 23, 2019

Ping from triage, @TheDarkula, it seems like we're awaiting more profiling info?

@TheDarkula
Contributor Author

@Centril I'll recompile and get back soon

@TheDarkula
Contributor Author

Closing in favour of the upcoming #58679

@TheDarkula TheDarkula closed this Mar 3, 2019
Centril added a commit to Centril/rust that referenced this pull request Mar 9, 2019
…ister

Refactor passes and pass execution to be more parallel

For `syntex_syntax` (with 16 threads and 8 cores):
- Cuts `misc checking 1` from `0.096s` to `0.08325s`.
- Cuts `misc checking 2` from `0.3575s` to `0.2545s`.
- Cuts `misc checking 3` from `0.34625s` to `0.21375s`.
- Cuts `wf checking` from `0.3085s` to `0.05025s`.

Reduces overall execution time for `syntex_syntax` (with 8 threads and cores) from `4.92s` to `4.34s`.

Subsumes rust-lang#58494
Blocked on rust-lang#58250

r? @michaelwoerister
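
The before/after figures quoted in the commit message above can be restated as relative reductions. A minimal sketch, with the timings copied from that message and a hypothetical helper name `percent_reduction`:

```rust
/// Relative reduction from `before` to `after`, in percent of `before`.
fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    // (label, seconds before, seconds after), as quoted for `syntex_syntax`.
    let timings = [
        ("misc checking 1", 0.096, 0.08325),
        ("misc checking 2", 0.3575, 0.2545),
        ("misc checking 3", 0.34625, 0.21375),
        ("wf checking", 0.3085, 0.05025),
        ("overall (8 threads)", 4.92, 4.34),
    ];
    for (label, before, after) in timings {
        println!("{label}: {:.1}% less time", percent_reduction(before, after));
    }
}
```

Note that the per-pass figures and the overall figure were measured with different thread counts (16 and 8), so they are not directly additive.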